Noting that this might make the read_beyond_eof handling code unreachable (IIRC we added that when we were speculating about sources of wrong eof tracking that we couldn't reproduce).
I like this change; it simplifies things quite a bit. We'll have to check a bit more thoroughly, but a quick preliminary benchmark shows the extra syscall costs about 3-5 microseconds per call:
macos/intel
>element(1,timer:tc(fun() -> [file:position(Fd, eof) || I <- lists:seq(1,1000000)], ok end))/1000000.
3.739081
debian trixie/intel
f(Fd), {ok, Fd} = file:open("/var/tmp/f.txt", [binary, append, create,raw]).
(dbcore@db2.relengtest001.cloudant.net)5> element(1,timer:tc(fun() -> [file:position(Fd, eof) || I <- lists:seq(1,1000000)], ok end))/1000000.
4.243807
Another thing to keep an eye on is whether this could acquire any extra locks or affect concurrency. For example, would 10k couch_file processes each doing an extra lseek call hit a bottleneck in the OS layer? (I would wager not, but it's worth thinking about.)
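One rough way to probe the concurrency question is to run the same file:position(Fd, eof) loop from many processes at once and compare the per-call cost with the single-process numbers above. The module below is only a sketch under stated assumptions: eof_bench, run/3, worker/4 and loop/2 are hypothetical names, not part of couch_file, and each worker opens its own scratch file rather than a real database file.

```erlang
-module(eof_bench).
-export([run/3]).

%% Hypothetical helper, not part of couch_file.
%% Spawns NumProcs workers; each opens its own raw file in Dir and calls
%% file:position(Fd, eof) Iters times (Iters must be > 0).
%% Returns the mean cost per call in microseconds, averaged over workers.
run(Dir, NumProcs, Iters) ->
    Parent = self(),
    Pids = [spawn_link(fun() -> worker(Parent, Dir, N, Iters) end)
            || N <- lists:seq(1, NumProcs)],
    %% Collect one result per worker, matching on the worker pid.
    Times = [receive {Pid, PerCall} -> PerCall end || Pid <- Pids],
    lists:sum(Times) / NumProcs.

worker(Parent, Dir, N, Iters) ->
    Path = filename:join(Dir, "eof_bench_" ++ integer_to_list(N) ++ ".tmp"),
    %% A raw file can only be used by the process that opened it,
    %% so every worker opens its own descriptor.
    {ok, Fd} = file:open(Path, [binary, append, raw]),
    {Micros, ok} = timer:tc(fun() -> loop(Fd, Iters) end),
    ok = file:close(Fd),
    ok = file:delete(Path),
    Parent ! {self(), Micros / Iters}.

loop(_Fd, 0) ->
    ok;
loop(Fd, I) ->
    {ok, _Pos} = file:position(Fd, eof),
    loop(Fd, I - 1).
```

For example, eof_bench:run("/var/tmp", 100, 10000) runs 100 concurrent workers. Going to 10k workers needs a file descriptor limit high enough for that many open files, and because the workers compete for schedulers the result is only a coarse signal, not a measurement of lock contention in the OS.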
e2300c5 to 4c3fa2e
nickva left a comment
+1 with one minor optimization to reduce syscall count for pread batches
39b1223 to bc656f7
I ran a few k6-couch benchmark runs comparing main with this PR (100k docs, doc get and doc insert for 3 minutes).
Main: (results not preserved)
No Eof (PR): (results not preserved)
The difference is there but is tiny.
Medians for doc_get and doc_insert: maybe 0.1-0.5 msec difference.
P90 for doc_get and doc_insert: about the same for reads and <1 msec difference for writes.
24e7873 to bd3afbf
Overview
Manually tracking the eof is error-prone. For example, file:write might return ok but not have written all bytes due to buffering. couch_file advances the eof anyway, so subsequent writes before a datasync will return the wrong position. Calling file:position/2 is very cheap, so just do that when we're writing.
Testing recommendations
Covered by eunit tests.
Related Issues or Pull Requests
N/A